The opticlust algorithm yielded high quality OTUs based on the high precision metrics below.
| Parameter | Value |
|---|---|
| cutoff | 0.2 |
| Sensitivity | 0.998 |
| Specificity | 0.999 |
| PPV | 0.998 |
| NPV | 0.999 |
| Accuracy | 0.999 |
| MCC | 0.997 |
| F1score | 0.998 |
| FDR | 0.002 |
Phylum frequency percentage cumulative_perc
1 Firmicutes 42 66.67 66.67
2 Actinobacteria 5 7.94 74.61
3 Bacteroidetes 5 7.94 82.55
4 Proteobacteria 5 7.94 90.49
5 Tenericutes 2 3.17 93.66
6 Bacteria_unclassified 1 1.59 95.25
7 Deinococcus-Thermus 1 1.59 96.84
8 Patescibacteria 1 1.59 98.43
9 Verrucomicrobia 1 1.59 100.00
Class frequency percentage cumulative_perc
1 Clostridia 31 49.21 49.21
2 Bacilli 8 12.70 61.91
3 Bacteroidia 5 7.94 69.85
4 Gammaproteobacteria 5 7.94 77.79
5 Actinobacteria 2 3.17 80.96
6 Coriobacteriia 2 3.17 84.13
7 Erysipelotrichia 2 3.17 87.30
8 Mollicutes 2 3.17 90.47
9 Actinobacteria_unclassified 1 1.59 92.06
10 Bacteria_unclassified 1 1.59 93.65
11 Deinococci 1 1.59 95.24
12 Firmicutes_unclassified 1 1.59 96.83
13 Saccharimonadia 1 1.59 98.42
14 Verrucomicrobiae 1 1.59 100.00
Order frequency percentage cumulative_perc
1 Clostridiales 30 47.62 47.62
2 Bacteroidales 4 6.35 53.97
3 Lactobacillales 4 6.35 60.32
4 Bacillales 3 4.76 65.08
5 Coriobacteriales 2 3.17 68.25
6 Erysipelotrichales 2 3.17 71.42
7 Actinobacteria_unclassified 1 1.59 73.01
8 Actinomycetales 1 1.59 74.60
9 Bacilli_unclassified 1 1.59 76.19
10 Bacteria_unclassified 1 1.59 77.78
11 Bacteroidia_unclassified 1 1.59 79.37
12 Betaproteobacteriales 1 1.59 80.96
13 Bifidobacteriales 1 1.59 82.55
14 Clostridia_unclassified 1 1.59 84.14
15 Deinococcales 1 1.59 85.73
16 Enterobacteriales 1 1.59 87.32
17 Firmicutes_unclassified 1 1.59 88.91
18 Gammaproteobacteria_unclassified 1 1.59 90.50
19 Mollicutes_RF39 1 1.59 92.09
20 Mollicutes_unclassified 1 1.59 93.68
21 Pseudomonadales 1 1.59 95.27
22 Saccharimonadales 1 1.59 96.86
23 Verrucomicrobiae_unclassified 1 1.59 98.45
24 Xanthomonadales 1 1.59 100.00
Family frequency percentage cumulative_perc
1 Lachnospiraceae 13 20.63 20.63
2 Ruminococcaceae 11 17.46 38.09
3 Clostridiaceae_1 3 4.76 42.85
4 Erysipelotrichaceae 2 3.17 46.02
5 Family_XIII 2 3.17 49.19
6 Lactobacillaceae 2 3.17 52.36
7 Actinobacteria_unclassified 1 1.59 53.95
8 Actinomycetaceae 1 1.59 55.54
9 Bacillales_unclassified 1 1.59 57.13
10 Bacilli_unclassified 1 1.59 58.72
11 Bacteria_unclassified 1 1.59 60.31
12 Bacteroidaceae 1 1.59 61.90
13 Bacteroidales_unclassified 1 1.59 63.49
14 Bacteroidia_unclassified 1 1.59 65.08
15 Bifidobacteriaceae 1 1.59 66.67
16 Clostridia_unclassified 1 1.59 68.26
17 Clostridiales_unclassified 1 1.59 69.85
18 Coriobacteriales_unclassified 1 1.59 71.44
19 Deinococcaceae 1 1.59 73.03
20 Eggerthellaceae 1 1.59 74.62
21 Enterobacteriaceae 1 1.59 76.21
22 Firmicutes_unclassified 1 1.59 77.80
23 Gammaproteobacteria_unclassified 1 1.59 79.39
24 Lactobacillales_unclassified 1 1.59 80.98
25 Listeriaceae 1 1.59 82.57
26 Mollicutes_RF39_fa 1 1.59 84.16
27 Mollicutes_unclassified 1 1.59 85.75
28 Muribaculaceae 1 1.59 87.34
29 Neisseriaceae 1 1.59 88.93
30 Pseudomonadaceae 1 1.59 90.52
31 Rikenellaceae 1 1.59 92.11
32 Saccharimonadaceae 1 1.59 93.70
33 Staphylococcaceae 1 1.59 95.29
34 Streptococcaceae 1 1.59 96.88
35 Verrucomicrobiae_unclassified 1 1.59 98.47
36 Xanthomonadaceae 1 1.59 100.00
Figure x: Unfiltered and curated OTU abundance. Visual representation of taxon terms highlight the most abundant taxon based on frequency of being assigned to an OTU or tree nodes. Muribaculaceae is the most frequently assigned family and Muribaculaceae_ge is the most frequent species assigned to most sequences.
OTUs
[1] "Otu01" "Otu02" "Otu03" "Otu04" "Otu05" "Otu06" "Otu08" "Otu09"
[9] "Otu10" "Otu17"
Phylum
[1] "Bacteroidetes" "Firmicutes"
Class
[1] "Bacteroidia" "Clostridia"
Order
[1] "Bacteroidales" "Clostridiales" "Lactobacillales"
Family
[1] "Bacteroidaceae" "Lachnospiraceae" "Muribaculaceae" "Ruminococcaceae"
Genus
[1] "Alistipes" "Bacteroides" "Lactobacillus"
[4] "Mollicutes_RF39_ge" "Muribaculaceae_ge" "Oscillospira"
[7] "Turicibacter"
Figure x. Rank abundance of eight samples. Package: goeveg
Figure x. Correlation between species identified at phylum-level. Species are ordered alphabetically (top panel) and heuristically (bottom panel)
Figure x: Stacked barplots for species richness. The estimated richness (green bars) was calculated using chao calculator and observed ichness (red bars) was calculated using sobs.
Figure x: Species richness (observed species) displayed by boxplot (A), density plots (B) and histograms (C).
Figure x: Correlation between species richness and sequence depth. Observed species calculated using sobs (A) and estimated species richness by chao calculator (B).
Figure x: Species diversity and correlation to species richness. Definitely phylo-diversity (C) correlates well with the species richness.
Example: Figure10 in iMAP paper
Figure x: Species diversity estimates as a function of sample size. Only species with abundance greater or equal to 1 are detected in the sample.
OTUbased
Number_clusters Value_Index
3.0000 2.0409
Phylum
Number_clusters Value_Index
3.000 4.683
Class
Number_clusters Value_Index
3.0000 4.1005
Order
Number_clusters Value_Index
3.0000 3.6381
Family
Number_clusters Value_Index
2.0000 2.7126
Genus
Number_clusters Value_Index
3.0000 2.4937
Square root transformation
Wisconsin double standardization
Run 0 stress 0.0008337215
Run 1 stress 0.001379944
Run 2 stress 0.001625067
Run 3 stress 0.0006664797
... New best solution
... Procrustes: rmse 0.1037003 max resid 0.1840151
Run 4 stress 0.06125386
Run 5 stress 0.001599348
Run 6 stress 9.586465e-05
... New best solution
... Procrustes: rmse 0.08112953 max resid 0.1397838
Run 7 stress 0.005028158
Run 8 stress 0.001724613
Run 9 stress 9.89946e-05
... Procrustes: rmse 0.08519075 max resid 0.1338683
Run 10 stress 0.001634013
Run 11 stress 9.798934e-05
... Procrustes: rmse 0.1025929 max resid 0.1591433
Run 12 stress 0.001559953
Run 13 stress 0.0003032558
... Procrustes: rmse 0.03095813 max resid 0.06258805
Run 14 stress 0.0005031003
... Procrustes: rmse 0.1147324 max resid 0.1661381
Run 15 stress 0.0009696094
Run 16 stress 0.00164202
Run 17 stress 9.854687e-05
... Procrustes: rmse 0.09427642 max resid 0.1366902
Run 18 stress 0.002092104
Run 19 stress 0.001540382
Run 20 stress 9.519968e-05
... New best solution
... Procrustes: rmse 0.07869783 max resid 0.1100281
*** No convergence -- monoMDS stopping criteria:
14: no. of iterations >= maxit
5: stress < smin
1: stress ratio > sratmax
Square root transformation
Wisconsin double standardization
Run 0 stress 9.885052e-05
Run 1 stress 9.53452e-05
... New best solution
... Procrustes: rmse 0.1065741 max resid 0.1697209
Run 2 stress 9.951807e-05
... Procrustes: rmse 0.07469551 max resid 0.1115015
Run 3 stress 9.745068e-05
... Procrustes: rmse 0.07763297 max resid 0.1373501
Run 4 stress 0.001067049
Run 5 stress 4.372603e-05
... New best solution
... Procrustes: rmse 0.1059044 max resid 0.1494925
Run 6 stress 9.897749e-05
... Procrustes: rmse 0.09481433 max resid 0.1718094
Run 7 stress 7.18088e-05
... Procrustes: rmse 0.1137924 max resid 0.1678205
Run 8 stress 9.928715e-05
... Procrustes: rmse 0.1261378 max resid 0.21985
Run 9 stress 7.839463e-05
... Procrustes: rmse 0.09973258 max resid 0.1540742
Run 10 stress 9.605322e-05
... Procrustes: rmse 0.1003967 max resid 0.1512087
Run 11 stress 9.954273e-05
... Procrustes: rmse 0.1121705 max resid 0.1756348
Run 12 stress 9.687449e-05
... Procrustes: rmse 0.1356559 max resid 0.1636638
Run 13 stress 0.0003731287
... Procrustes: rmse 0.1142354 max resid 0.1680688
Run 14 stress 9.55993e-05
... Procrustes: rmse 0.1003592 max resid 0.1588128
Run 15 stress 9.828561e-05
... Procrustes: rmse 0.1074659 max resid 0.1625574
Run 16 stress 0.06811531
Run 17 stress 9.253481e-05
... Procrustes: rmse 0.1305726 max resid 0.2199027
Run 18 stress 8.978303e-05
... Procrustes: rmse 0.1136534 max resid 0.1680695
Run 19 stress 0.0001783715
... Procrustes: rmse 0.09088008 max resid 0.1386487
Run 20 stress 0.000119464
... Procrustes: rmse 0.1023554 max resid 0.1541552
*** No convergence -- monoMDS stopping criteria:
4: no. of iterations >= maxit
15: stress < smin
1: stress ratio > sratmax
Square root transformation
Wisconsin double standardization
Run 0 stress 0.0001674397
Run 1 stress 0.1079485
Run 2 stress 9.997663e-05
... New best solution
... Procrustes: rmse 0.0928957 max resid 0.1641697
Run 3 stress 9.810815e-05
... New best solution
... Procrustes: rmse 0.02436755 max resid 0.03330625
Run 4 stress 9.746947e-05
... New best solution
... Procrustes: rmse 0.05873104 max resid 0.08644727
Run 5 stress 9.713544e-05
... New best solution
... Procrustes: rmse 0.05144213 max resid 0.07440652
Run 6 stress 8.838194e-05
... New best solution
... Procrustes: rmse 0.0876071 max resid 0.1438818
Run 7 stress 9.933442e-05
... Procrustes: rmse 0.1576782 max resid 0.2247313
Run 8 stress 8.754791e-05
... New best solution
... Procrustes: rmse 0.007195614 max resid 0.01129571
Run 9 stress 0.001114504
Run 10 stress 9.329577e-05
... Procrustes: rmse 0.02776029 max resid 0.04415047
Run 11 stress 0.002764123
Run 12 stress 0.0008529207
Run 13 stress 8.107075e-05
... New best solution
... Procrustes: rmse 0.004852399 max resid 0.006437002
... Similar to previous best
*** Solution reached
Square root transformation
Wisconsin double standardization
Run 0 stress 0.000413453
Run 1 stress 0.09395731
Run 2 stress 0.001073979
Run 3 stress 0.006556755
Run 4 stress 0.004028627
Run 5 stress 0.09395733
Run 6 stress 0.0009820881
Run 7 stress 0.0001767323
... New best solution
... Procrustes: rmse 0.03265705 max resid 0.05066357
Run 8 stress 0.001002464
Run 9 stress 0.004035854
Run 10 stress 0.004368155
Run 11 stress 0.0001296708
... New best solution
... Procrustes: rmse 0.0248724 max resid 0.04081775
Run 12 stress 0.006458073
Run 13 stress 0.006459056
Run 14 stress 0.0002502822
... Procrustes: rmse 0.04356952 max resid 0.07433493
Run 15 stress 0.09395743
Run 16 stress 0.005371769
Run 17 stress 0.0004471322
... Procrustes: rmse 0.0166013 max resid 0.02545187
Run 18 stress 0.002520092
Run 19 stress 0.00153722
Run 20 stress 0.0007879612
*** No convergence -- monoMDS stopping criteria:
17: no. of iterations >= maxit
3: stress ratio > sratmax
Square root transformation
Wisconsin double standardization
Run 0 stress 0.008008543
Run 1 stress 0.02935725
Run 2 stress 0.02935702
Run 3 stress 0.007553667
... New best solution
... Procrustes: rmse 0.08707788 max resid 0.1536088
Run 4 stress 0.02935692
Run 5 stress 0.02935789
Run 6 stress 0.008318727
Run 7 stress 0.02935669
Run 8 stress 0.02935759
Run 9 stress 0.007547803
... New best solution
... Procrustes: rmse 0.0006652656 max resid 0.001103211
... Similar to previous best
Run 10 stress 0.008344215
*** Solution reached
Square root transformation
Wisconsin double standardization
Run 0 stress 0.01811782
Run 1 stress 0.01986519
Run 2 stress 0.02006416
Run 3 stress 0.01837826
... Procrustes: rmse 0.01528503 max resid 0.02694752
Run 4 stress 0.01986769
Run 5 stress 0.02027452
Run 6 stress 0.01811644
... New best solution
... Procrustes: rmse 0.0002576534 max resid 0.0004266533
... Similar to previous best
Run 7 stress 0.01811681
... Procrustes: rmse 0.0001101164 max resid 0.0001867403
... Similar to previous best
Run 8 stress 0.01854509
... Procrustes: rmse 0.02299282 max resid 0.04127708
Run 9 stress 0.06108471
Run 10 stress 0.01954551
*** Solution reached
OTUs
----------------------------
Call:
metaMDS(comm = otu.t, distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(otu.t))
Distance: bray
Dimensions: 3
Stress: 9.519968e-05
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(otu.t))'
Phylum
----------------------------
Call:
metaMDS(comm = phylum.t, distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(phylum.t))
Distance: bray
Dimensions: 3
Stress: 4.372603e-05
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(phylum.t))'
Class
----------------------------
Call:
metaMDS(comm = class.t, distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(class.t))
Distance: bray
Dimensions: 3
Stress: 8.107075e-05
Stress type 1, weak ties
Two convergent solutions found after 13 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(class.t))'
Order
----------------------------
Call:
metaMDS(comm = order.t, distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(order.t))
Distance: bray
Dimensions: 3
Stress: 0.0001296708
Stress type 1, weak ties
No convergent solutions - best solution after 20 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(order.t))'
Family
----------------------------
Call:
metaMDS(comm = family.t, distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(family.t))
Distance: bray
Dimensions: 3
Stress: 0.007547803
Stress type 1, weak ties
Two convergent solutions found after 10 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(family.t))'
Genus
----------------------------
Call:
metaMDS(comm = genus.t, distance = "bray", k = 3, try = 10, display = c("sites"), choices = c(1, 2), type = "t", shrink = FALSE)
global Multidimensional Scaling using monoMDS
Data: wisconsin(sqrt(genus.t))
Distance: bray
Dimensions: 3
Stress: 0.01811644
Stress type 1, weak ties
Two convergent solutions found after 10 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on 'wisconsin(sqrt(genus.t))'
Figure X. Sherperd and non-metric multidimensional scaling plot. Green oints represent samples and red points represent OTU or species. Similar samples are ordinated together. Stress values are shown at the botthom of ordination plot.
Figure x: Sample Phylip or Newick-formatted tree clustered using the UPGMA (Unweighted Pair Group Method with Arithmetic Mean) algorithm. Similar data was used to construct different types of tree including rectangular (A), circular (B) and unrooted (C). The tree is annotated with sequence depth, species richness, and phylum abundance as seen in (D).
R version 3.5.2 (2018-12-20)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.3
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] NbClust_3.0 reshape2_1.4.3 iNEXT_2.0.19 ggpubr_0.2
[5] magrittr_1.5 goeveg_0.4.2 vegan_2.5-4 permute_0.9-4
[9] funModeling_1.7 Hmisc_4.2-0 Formula_1.2-3 survival_2.43-3
[13] lattice_0.20-38 gridExtra_2.3 scales_1.0.0 ggplot2_3.1.0
loaded via a namespace (and not attached):
[1] maps_3.3.0 splines_3.5.2 dotCall64_1.0-0
[4] gtools_3.8.1 moments_0.14 assertthat_0.2.0
[7] latticeExtra_0.6-28 pander_0.6.3 yaml_2.2.0
[10] corrplot_0.84 pillar_1.3.1 backports_1.1.3
[13] glue_1.3.0 digest_0.6.18 RColorBrewer_1.1-2
[16] checkmate_1.9.1 colorspace_1.4-0 cowplot_0.9.4
[19] htmltools_0.3.6 Matrix_1.2-15 plyr_1.8.4
[22] pkgconfig_2.0.2 purrr_0.3.1 gdata_2.18.0
[25] htmlTable_1.13.1 tibble_2.0.1 mgcv_1.8-27
[28] withr_2.1.2 ROCR_1.0-7 nnet_7.3-12
[31] lazyeval_0.2.1 crayon_1.3.4 evaluate_0.13
[34] nlme_3.1-137 MASS_7.3-51.1 gplots_3.0.1.1
[37] foreign_0.8-71 tools_3.5.2 data.table_1.12.0
[40] hms_0.4.2 stringr_1.4.0 munsell_0.5.0
[43] cluster_2.0.7-1 entropy_1.2.1 compiler_3.5.2
[46] caTools_1.17.1.1 rlang_0.3.1 rstudioapi_0.9.0
[49] htmlwidgets_1.3 spam_2.2-1 bitops_1.0-6
[52] base64enc_0.1-3 labeling_0.3 rmarkdown_1.11
[55] gtable_0.2.0 R6_2.4.0 knitr_1.21
[58] dplyr_0.8.0.1 readr_1.3.1 KernSmooth_2.23-15
[61] stringi_1.3.1 parallel_3.5.2 Rcpp_1.0.0
[64] fields_9.6 rpart_4.1-13 acepack_1.4.1
[67] tidyselect_0.2.5 xfun_0.5